In the initial phase of my research project, Census Tract Selection, I selected six census tracts within Chittenden County—three with high levels of vehicle access, three with low levels—using estimates from the U.S. Census Bureau’s American Community Survey (ACS).

Estimates are accompanied by a margin of error (MOE), “a measure of the possible variation of the estimate around the population value,” calculated at a 90% confidence level, the Census Bureau’s default (Fuller & U.S. Census Bureau, 2016).
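As a rough illustration of what a 90% MOE implies, the sketch below converts a margin of error into a standard error and a confidence interval. The estimate and MOE values are hypothetical; the 1.645 factor is the z-score associated with a 90% confidence level.

```r
# Hypothetical ACS estimate and its published 90% margin of error
estimate <- 500
moe_90   <- 82.25

# Standard error implied by a 90% MOE (z = 1.645)
se <- moe_90 / 1.645

# Bounds of the 90% confidence interval around the estimate
ci_90 <- c(lower = estimate - moe_90, upper = estimate + moe_90)

print(se)     # 50
print(ci_90)  # lower 417.75, upper 582.25
```

Read this way, an estimate of 500 with an MOE of about 82 suggests the population value plausibly lies anywhere from roughly 418 to 582.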

As discussed in my proposal, this way of grouping creates two distinct categories for the examination and comparison of differences in aesthetic characteristics, and acts as an empirical selection method to avoid bias.

To acquire the skills necessary to appropriately form these groups, I completed the course Analyzing US Census Data in R published by DataCamp, an online learning platform.

Data Collection

In the Data_Collection.R script below, the tidycensus R library is used to gather and organize data on Census Tracts in Chittenden County, VT from the 2022 ACS 5-year estimates for Table B08141: Means of Transportation to Work by Vehicles Available for workers 16 years and over in households.

Within Table B08141, variables B08141_002 (“No vehicle available”), B08141_003 (“1 vehicle available”), B08141_004 (“2 vehicles available”), and B08141_005 (“3 or more vehicles available”) were used to estimate the distribution of vehicle access levels among the universe for each Census Tract.

A summed MOE was calculated using the tidycensus function moe_sum() and combined with the organized data to produce a master data table, chittenden_county_long_moe.
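For intuition, the Census Bureau's approximation for the MOE of a sum is the square root of the sum of the squared component MOEs. The base R sketch below shows only that core calculation; rss_moe() is a hypothetical helper and the MOE values are made up. It is not the tidycensus implementation, and moe_sum() may treat zero estimates specially when the estimate argument is supplied, which this sketch does not attempt to reproduce.

```r
# Hypothetical helper: approximate MOE of a sum of estimates
# (root sum of squares of the component MOEs)
rss_moe <- function(moe) {
  sqrt(sum(moe^2))
}

# Made-up MOEs for the four vehicle-availability categories of one tract
example_moes <- c(50, 50, 50, 50)

rss_moe(example_moes)  # 100
```

Note that the combined MOE (100) is smaller than the simple sum of the four MOEs (200), since independent errors partly offset one another.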

Note: As Census Tract 9800 (BTV - Burlington International Airport) has a population of zero, its estimates were not included.

#load tidycensus, tidyverse, and data.table libraries
library(tidycensus)
library(tidyverse)
library(data.table)
#define variables
vehicle_vars <- c(no_vehicles = "B08141_002", one_vehicle = "B08141_003", two_vehicles = "B08141_004", three_plus_vehicles = "B08141_005")

#call data (ACS 2022 5-year estimate)
chittenden_county <- get_acs(geography = "tract", variables = vehicle_vars, year = 2022, state = "VT", county = "Chittenden County", survey = "acs5")
## Getting data from the 2018-2022 5-year ACS
#delete entries for Census Tract 9800 (rows 161-164), matched by name rather than by row position
chittenden_county <- chittenden_county[!grepl("Census Tract 9800", chittenden_county$NAME, fixed = TRUE), ]

#create columns for variables; combine rows of same Census Tract; delete MOE column
chittenden_county_long <- dcast(chittenden_county, GEOID + NAME ~ variable, value.var = c("estimate"))

#reorder rows by GEOID
chittenden_county_long <- arrange(chittenden_county_long, GEOID)

#reorder columns
chittenden_county_long <- chittenden_county_long[,c(1,2,3,4,6,5)]

#create dataframe with summed moe for each Census Tract (grouped by GEOID)
moe <- chittenden_county %>% group_by(GEOID) %>% summarize(MOE_GROUP_CT = moe_sum(moe = moe, estimate = estimate))

#combine chittenden_county_long and moe dataframes
chittenden_county_long_moe <- chittenden_county_long %>% mutate(MOE_GROUP_CT = moe$MOE_GROUP_CT)
B08141_002:5 | # of Vehicles Available (w/ summed MOE)
GEOID NAME no_vehicles one_vehicle two_vehicles three_plus_vehicles MOE_GROUP_CT
50007000100 Census Tract 1; Chittenden County; Vermont 114 537 1270 547 337.7188
50007000200 Census Tract 2; Chittenden County; Vermont 14 721 1688 473 535.2635
50007000300 Census Tract 3; Chittenden County; Vermont 250 1532 880 367 477.3437
50007000600 Census Tract 6; Chittenden County; Vermont 111 327 1147 1086 581.0508
50007000800 Census Tract 8; Chittenden County; Vermont 38 375 805 263 329.7120
50007000900 Census Tract 9; Chittenden County; Vermont 37 379 578 410 397.9987
50007001000 Census Tract 10; Chittenden County; Vermont 86 687 733 33 406.1551
50007001100 Census Tract 11; Chittenden County; Vermont 52 403 500 372 332.1295
50007002101 Census Tract 21.01; Chittenden County; Vermont 0 92 803 867 489.8806
50007002103 Census Tract 21.03; Chittenden County; Vermont 0 179 1434 785 452.6323
50007002104 Census Tract 21.04; Chittenden County; Vermont 0 302 1229 362 397.0000
50007002201 Census Tract 22.01; Chittenden County; Vermont 0 438 205 87 215.7151
50007002202 Census Tract 22.02; Chittenden County; Vermont 53 1106 1533 459 538.6984
50007002301 Census Tract 23.01; Chittenden County; Vermont 0 35 345 330 181.5847
50007002303 Census Tract 23.03; Chittenden County; Vermont 98 564 1299 785 506.2243
50007002304 Census Tract 23.04; Chittenden County; Vermont 17 523 796 294 383.7095
50007002400 Census Tract 24; Chittenden County; Vermont 266 531 1193 259 462.6867
50007002501 Census Tract 25.01; Chittenden County; Vermont 0 280 609 580 467.5746
50007002502 Census Tract 25.02; Chittenden County; Vermont 67 923 307 280 573.1030
50007002601 Census Tract 26.01; Chittenden County; Vermont 188 1037 1510 706 552.9756
50007002602 Census Tract 26.02; Chittenden County; Vermont 66 648 1301 668 398.6427
50007002701 Census Tract 27.01; Chittenden County; Vermont 250 555 1442 989 584.3886
50007002702 Census Tract 27.02; Chittenden County; Vermont 0 279 1806 685 480.3686
50007002800 Census Tract 28; Chittenden County; Vermont 0 369 1293 1164 415.1169
50007002900 Census Tract 29; Chittenden County; Vermont 106 467 1950 1127 445.7970
50007003000 Census Tract 30; Chittenden County; Vermont 0 350 1390 715 346.9597
50007003101 Census Tract 31.01; Chittenden County; Vermont 11 722 2541 1301 711.3044
50007003102 Census Tract 31.02; Chittenden County; Vermont 0 111 489 516 303.2656
50007003301 Census Tract 33.01; Chittenden County; Vermont 19 318 1230 652 360.2291
50007003304 Census Tract 33.04; Chittenden County; Vermont 27 1011 2036 839 761.5314
50007003401 Census Tract 34.01; Chittenden County; Vermont 29 527 1303 1010 552.1322
50007003402 Census Tract 34.02; Chittenden County; Vermont 25 117 613 222 300.6626
50007003501 Census Tract 35.01; Chittenden County; Vermont 23 355 950 838 418.0132
50007003502 Census Tract 35.02; Chittenden County; Vermont 54 363 1712 738 465.9893
50007003503 Census Tract 35.03; Chittenden County; Vermont 0 129 545 355 153.2286
50007003600 Census Tract 36; Chittenden County; Vermont 0 639 2113 367 666.6326
50007003900 Census Tract 39; Chittenden County; Vermont 72 399 380 205 255.2842
50007004002 Census Tract 40.02; Chittenden County; Vermont 35 560 1220 685 474.0612
50007004100 Census Tract 41; Chittenden County; Vermont 123 341 608 340 447.7298
50007004200 Census Tract 42; Chittenden County; Vermont 251 1021 799 784 545.0183

Variable 1: PCT_0

To most accurately assess vehicle access by tract, two scripts—each providing a distinct measure of vehicle access—were assembled. The first of these, PCT_0.R, creates a variable (PCT_0) that represents the estimate for “No vehicle available” as a percentage of the total estimate count for each Census Tract.
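As a worked example of this percentage, the sketch below computes PCT_0 for Census Tract 1 using the estimates listed in the table above. The object names are illustrative only and are not part of PCT_0.R.

```r
# Estimates for Census Tract 1, taken from the table above
tract_1 <- c(no_vehicles = 114, one_vehicle = 537,
             two_vehicles = 1270, three_plus_vehicles = 547)

# Share of the tract's total estimate count with no vehicle available
pct_0_tract_1 <- 100 * tract_1[["no_vehicles"]] / sum(tract_1)

round(pct_0_tract_1, 4)  # 4.6191
```

The same arithmetic could in principle be applied to every tract at once by dividing the no_vehicles column by the row sums of the four category columns.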

As the practical difference between having no vehicle and having one vehicle is much greater than that between having one vehicle and having two, and so forth, PCT_0 is an effective—albeit limited—measure of vehicle access. In other words, the proportion of the population (workers 16 years and over in households) in a given Census Tract who lack access to a vehicle is a relevant indicator of that tract’s overall vehicle access level.

In addition to the creation of the PCT_0 variable, the MOE for each tract’s PCT_0 value is calculated and added to the pct_0 dataframe.
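To show the kind of calculation involved, the sketch below applies the Census Bureau's approximation for the MOE of a proportion in base R. prop_moe() is a hypothetical helper written for illustration, not the tidycensus source. The inputs are for Census Tract 21.01: a “No vehicle available” estimate of 0 with an MOE of 10, a total count of 1762 (92 + 803 + 867), and a summed MOE of 489.8806.

```r
# Hypothetical helper implementing the proportion-MOE approximation:
# MOE(p) = sqrt(moe_num^2 - p^2 * moe_denom^2) / denom
prop_moe <- function(num, denom, moe_num, moe_denom) {
  p <- num / denom
  radicand <- moe_num^2 - (p^2 * moe_denom^2)
  # Fall back to the ratio form when the radicand is negative
  if (radicand < 0) {
    radicand <- moe_num^2 + (p^2 * moe_denom^2)
  }
  sqrt(radicand) / denom
}

# Census Tract 21.01, expressed as a percentage
moe_pct <- 100 * prop_moe(num = 0, denom = 1762, moe_num = 10, moe_denom = 489.8806)

round(moe_pct, 4)  # 0.5675
```

Because the numerator estimate is 0, the result reduces to the numerator's MOE divided by the total count.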

#initialize dataframe for pct_0
pct_0 <- data.frame()

#find sums of counts for each Census Tract
countsum <- rowSums(chittenden_county_long_moe[,3:6])


#create new column in pct_0 with percent of tract count from no_vehicles
for (i in 1:40) {
  pct_0[i,1] = 100 * (chittenden_county_long_moe[i,3] / countsum[i])
  }

#name new column "PCT_0"
names(pct_0) <- "PCT_0"

#add  column for GEOIDs
pct_0 <- pct_0 %>% mutate(GEOID = chittenden_county_long_moe$GEOID)

#add column for Census Tract names
pct_0 <- pct_0 %>% mutate(NAME = chittenden_county_long_moe$NAME)

#reorder columns
pct_0 <- pct_0[,c(2,3,1)]


#create new column in moe (moe for no_vehicles count)
for (i in seq(0,159,by=4)) {
  moe[(i/4)+1,3] = chittenden_county[i+1,5]
  }

#name new column "MOE_0_CT"
colnames(moe)[3] <- "MOE_0_CT"

#create new column in pct_0 with moe for percent of tract count from no_vehicles
for (i in 1:40) {
  pct_0[i,4] = 100 * (moe_prop(num = chittenden_county_long_moe[i,3], denom = countsum[i], moe_num = moe[i,3], moe_denom = chittenden_county_long_moe[i,7]))
  }

#name new column "MOE_0_PCT"
colnames(pct_0)[4] <- "MOE_0_PCT"
PCT_0 | % No Vehicles
GEOID NAME PCT_0 MOE_0_PCT
50007000100 Census Tract 1; Chittenden County; Vermont 4.6191248 4.2072418
50007000200 Census Tract 2; Chittenden County; Vermont 0.4834254 0.6499646
50007000300 Census Tract 3; Chittenden County; Vermont 8.2535490 5.7307751
50007000600 Census Tract 6; Chittenden County; Vermont 4.1557469 3.5172737
50007000800 Census Tract 8; Chittenden County; Vermont 2.5658339 3.8061245
50007000900 Census Tract 9; Chittenden County; Vermont 2.6353276 3.1168517
50007001000 Census Tract 10; Chittenden County; Vermont 5.5880442 3.8186984
50007001100 Census Tract 11; Chittenden County; Vermont 3.9186134 4.2592998
50007002101 Census Tract 21.01; Chittenden County; Vermont 0.0000000 0.5675369
50007002103 Census Tract 21.03; Chittenden County; Vermont 0.0000000 0.4170142
50007002104 Census Tract 21.04; Chittenden County; Vermont 0.0000000 0.5282620
50007002201 Census Tract 22.01; Chittenden County; Vermont 0.0000000 1.3698630
50007002202 Census Tract 22.02; Chittenden County; Vermont 1.6820057 2.5544783
50007002301 Census Tract 23.01; Chittenden County; Vermont 0.0000000 1.4084507
50007002303 Census Tract 23.03; Chittenden County; Vermont 3.5688274 4.3570167
50007002304 Census Tract 23.04; Chittenden County; Vermont 1.0429448 1.2021800
50007002400 Census Tract 24; Chittenden County; Vermont 11.8274789 5.6813816
50007002501 Census Tract 25.01; Chittenden County; Vermont 0.0000000 0.6807352
50007002502 Census Tract 25.02; Chittenden County; Vermont 4.2485732 4.3639585
50007002601 Census Tract 26.01; Chittenden County; Vermont 5.4635280 4.2401886
50007002602 Census Tract 26.02; Chittenden County; Vermont 2.4599329 3.7466572
50007002701 Census Tract 27.01; Chittenden County; Vermont 7.7255871 7.7556096
50007002702 Census Tract 27.02; Chittenden County; Vermont 0.0000000 0.5054152
50007002800 Census Tract 28; Chittenden County; Vermont 0.0000000 0.4953999
50007002900 Census Tract 29; Chittenden County; Vermont 2.9041096 2.9651621
50007003000 Census Tract 30; Chittenden County; Vermont 0.0000000 0.4073320
50007003101 Census Tract 31.01; Chittenden County; Vermont 0.2404372 0.4136147
50007003102 Census Tract 31.02; Chittenden County; Vermont 0.0000000 0.8960573
50007003301 Census Tract 33.01; Chittenden County; Vermont 0.8562416 1.3900933
50007003304 Census Tract 33.04; Chittenden County; Vermont 0.6900077 1.1421456
50007003401 Census Tract 34.01; Chittenden County; Vermont 1.0108052 1.2043281
50007003402 Census Tract 34.02; Chittenden County; Vermont 2.5588536 3.7043290
50007003501 Census Tract 35.01; Chittenden County; Vermont 1.0618652 1.6958812
50007003502 Census Tract 35.02; Chittenden County; Vermont 1.8835019 2.6331180
50007003503 Census Tract 35.03; Chittenden County; Vermont 0.0000000 0.9718173
50007003600 Census Tract 36; Chittenden County; Vermont 0.0000000 0.3206156
50007003900 Census Tract 39; Chittenden County; Vermont 6.8181818 6.4205944
50007004002 Census Tract 40.02; Chittenden County; Vermont 1.4000000 2.2242130
50007004100 Census Tract 41; Chittenden County; Vermont 8.7110482 6.2899116
50007004200 Census Tract 42; Chittenden County; Vermont 8.7915937 4.0817367

Variable 2: wAvg

To account for variations between tracts in categories other than “No vehicle available”, a second variable (wAvg) was calculated to represent the average number of vehicles available for each Census Tract.

wAvg was derived by finding the proportion of the total estimate count represented by each category’s estimate, weighting these proportions by their categories’ implied values (e.g., “1 vehicle available” → 1, with “3 or more vehicles available” treated as 3), and summing the weighted proportions to produce a weighted average for each tract.
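As a worked example, the sketch below reproduces this weighting for Census Tract 1 using the estimates from the first table. The object names are illustrative only.

```r
# Estimates for Census Tract 1: 0, 1, 2, and 3+ vehicles available
tract_1 <- c(114, 537, 1270, 547)

# Implied value of each category ("3 or more" is treated as 3)
vehicle_values <- c(0, 1, 2, 3)

# Proportion of the total estimate count in each category
proportions <- tract_1 / sum(tract_1)

# Weighted average number of vehicles available
wavg_tract_1 <- sum(proportions * vehicle_values)

round(wavg_tract_1, 4)  # 1.9117
```

Base R's weighted.mean(vehicle_values, tract_1) should give the same value, since it divides the weighted sum by the sum of the weights.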

#initialize dataframe for weighted_avg
weighted_avg <- data.frame()

#create new column in weighted_avg with weighted average of vehicles available by tract
for (i in 1:40) {
  weighted_avg[i,1] = ((chittenden_county_long_moe[i,3] / countsum[i])*0) + ((chittenden_county_long_moe[i,4] / countsum[i])*1) + ((chittenden_county_long_moe[i,5] / countsum[i])*2) + ((chittenden_county_long_moe[i,6] / countsum[i])*3)
}

#name new column "wAvg"
names(weighted_avg) <- "wAvg"

#add GEOIDs and Census Tract names, organize columns
#create new column in weighted_avg with GEOIDs
weighted_avg <- weighted_avg %>% mutate(GEOID = chittenden_county_long_moe$GEOID)

#create new column in weighted_avg with Census Tract names
weighted_avg <- weighted_avg %>% mutate(NAME = chittenden_county_long_moe$NAME)

#reorder columns
weighted_avg <- weighted_avg[,c(2,3,1)]
wAvg | Average # of Vehicles
GEOID NAME wAvg
50007000100 Census Tract 1; Chittenden County; Vermont 1.911669
50007000200 Census Tract 2; Chittenden County; Vermont 1.904696
50007000300 Census Tract 3; Chittenden County; Vermont 1.450314
50007000600 Census Tract 6; Chittenden County; Vermont 2.201048
50007000800 Census Tract 8; Chittenden County; Vermont 1.873059
50007000900 Census Tract 9; Chittenden County; Vermont 1.969373
50007001000 Census Tract 10; Chittenden County; Vermont 1.463288
50007001100 Census Tract 11; Chittenden County; Vermont 1.898267
50007002101 Census Tract 21.01; Chittenden County; Vermont 2.439841
50007002103 Census Tract 21.03; Chittenden County; Vermont 2.252711
50007002104 Census Tract 21.04; Chittenden County; Vermont 2.031696
50007002201 Census Tract 22.01; Chittenden County; Vermont 1.519178
50007002202 Census Tract 22.02; Chittenden County; Vermont 1.761028
50007002301 Census Tract 23.01; Chittenden County; Vermont 2.415493
50007002303 Census Tract 23.03; Chittenden County; Vermont 2.009104
50007002304 Census Tract 23.04; Chittenden County; Vermont 1.838650
50007002400 Census Tract 24; Chittenden County; Vermont 1.642508
50007002501 Census Tract 25.01; Chittenden County; Vermont 2.204221
50007002502 Census Tract 25.02; Chittenden County; Vermont 1.507292
50007002601 Census Tract 26.01; Chittenden County; Vermont 1.794536
50007002602 Census Tract 26.02; Chittenden County; Vermont 1.958256
50007002701 Census Tract 27.01; Chittenden County; Vermont 1.979604
50007002702 Census Tract 27.02; Chittenden County; Vermont 2.146570
50007002800 Census Tract 28; Chittenden County; Vermont 2.281316
50007002900 Census Tract 29; Chittenden County; Vermont 2.122740
50007003000 Census Tract 30; Chittenden County; Vermont 2.148676
50007003101 Census Tract 31.01; Chittenden County; Vermont 2.121749
50007003102 Census Tract 31.02; Chittenden County; Vermont 2.362903
50007003301 Census Tract 33.01; Chittenden County; Vermont 2.133393
50007003304 Census Tract 33.04; Chittenden County; Vermont 1.942244
50007003401 Census Tract 34.01; Chittenden County; Vermont 2.148135
50007003402 Census Tract 34.02; Chittenden County; Vermont 2.056295
50007003501 Census Tract 35.01; Chittenden County; Vermont 2.201754
50007003502 Census Tract 35.02; Chittenden County; Vermont 2.093129
50007003503 Census Tract 35.03; Chittenden County; Vermont 2.219631
50007003600 Census Tract 36; Chittenden County; Vermont 1.912793
50007003900 Census Tract 39; Chittenden County; Vermont 1.679924
50007004002 Census Tract 40.02; Chittenden County; Vermont 2.022000
50007004100 Census Tract 41; Chittenden County; Vermont 1.825071
50007004200 Census Tract 42; Chittenden County; Vermont 1.741156

Vehicle Access Score

With two distinct measures of vehicle access created (PCT_0 and wAvg), both variables were normalized and combined to form an index score (Vehicle_Access) that represents each tract’s relative level of vehicle access.

Variables PCT_0 and wAvg were normalized using min-max scaling, which rescales values to the range 0 to 1, via the preProcess() and predict() functions in the caret R library. As PCT_0 has a negative relationship with vehicle access, its normalized values were multiplied by -1 before being added to wAvg’s normalized values to form Vehicle_Access, which can therefore range from -1 (lowest access) to 1 (highest access).
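The scoring arithmetic can be sketched in base R without caret. min_max() below is a hypothetical helper, and the three tracts are chosen because they hold the county-wide extremes in the tables above (Tract 3 has the lowest wAvg, Tract 21.01 the lowest PCT_0 and highest wAvg, Tract 24 the highest PCT_0), so scaling this subset gives the same scores as scaling all 40 tracts.

```r
# Hypothetical helper: min-max scaling to the range 0 to 1
min_max <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# PCT_0 and wAvg values for three tracts, taken from the tables above
scores <- data.frame(
  tract = c("3", "21.01", "24"),
  PCT_0 = c(8.2535490, 0.0000000, 11.8274789),
  wAvg  = c(1.450314, 2.439841, 1.642508)
)

# Negated normalized PCT_0 plus normalized wAvg
scores$Vehicle_Access <- -1 * min_max(scores$PCT_0) + min_max(scores$wAvg)

round(scores$Vehicle_Access, 4)  # -0.6978  1.0000 -0.8058
```

Tract 21.01 scores exactly 1 because it sits at the minimum of PCT_0 and the maximum of wAvg at the same time.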

PCT_0 and wAvg (un-normalized) were then used as primary axes in a scatterplot (created using the ggplot2 and plotly R libraries) where the color of each point (representing a given tract) corresponds to that point’s Vehicle Access Score (Vehicle_Access).

#load ggplot2, plotly, and caret libraries
library(caret)
library(ggplot2)
library(plotly)


#join variable 1 and variable 2 in new datatable
var1_var2_join <- left_join(pct_0, weighted_avg, by = "GEOID")

#remove NAME.y column
var1_var2_join <- var1_var2_join %>% mutate(NAME.y = NULL)

#rename GEOID and NAME.x column
colnames(var1_var2_join)[1] <- "GEOID"
colnames(var1_var2_join)[2] <- "NAME"


#normalize (rescale 0:1) variable 1
PCT_0_process <- preProcess(as.data.frame(var1_var2_join$PCT_0), method=c("range"))
PCT_0_norm <- predict(PCT_0_process, as.data.frame(var1_var2_join$PCT_0))

#normalize (rescale 0:1) variable 2
wAvg_process <- preProcess(as.data.frame(var1_var2_join$wAvg), method=c("range"))
wAvg_norm <- predict(wAvg_process, as.data.frame(var1_var2_join$wAvg))

#create vehicle access score using normalized variables
for(i in 1:40) {
  var1_var2_join[i,6] = (-1 * PCT_0_norm[i,1]) + wAvg_norm[i,1]
}
colnames(var1_var2_join)[6] <- "Vehicle_Access"
Vehicle Access Score
GEOID NAME PCT_0 MOE_0_PCT wAvg Vehicle_Access
50007000100 Census Tract 1; Chittenden County; Vermont 4.6191248 4.2072418 1.911669 0.0756966
50007000200 Census Tract 2; Chittenden County; Vermont 0.4834254 0.6499646 1.904696 0.4183183
50007000300 Census Tract 3; Chittenden County; Vermont 8.2535490 5.7307751 1.450314 -0.6978283
50007000600 Census Tract 6; Chittenden County; Vermont 4.1557469 3.5172737 2.201048 0.4073163
50007000800 Census Tract 8; Chittenden County; Vermont 2.5658339 3.8061245 1.873059 0.2102808
50007000900 Census Tract 9; Chittenden County; Vermont 2.6353276 3.1168517 1.969373 0.3017390
50007001000 Census Tract 10; Chittenden County; Vermont 5.5880442 3.8186984 1.463288 -0.4593513
50007001100 Census Tract 11; Chittenden County; Vermont 3.9186134 4.2592998 1.898267 0.1213796
50007002101 Census Tract 21.01; Chittenden County; Vermont 0.0000000 0.5675369 2.439841 1.0000000
50007002103 Census Tract 21.03; Chittenden County; Vermont 0.0000000 0.4170142 2.252711 0.8108890
50007002104 Census Tract 21.04; Chittenden County; Vermont 0.0000000 0.5282620 2.031696 0.5875351
50007002201 Census Tract 22.01; Chittenden County; Vermont 0.0000000 1.3698630 1.519178 0.0695933
50007002202 Census Tract 22.02; Chittenden County; Vermont 1.6820057 2.5544783 1.761028 0.1717913
50007002301 Census Tract 23.01; Chittenden County; Vermont 0.0000000 1.4084507 2.415493 0.9753942
50007002303 Census Tract 23.03; Chittenden County; Vermont 3.5688274 4.3570167 2.009104 0.2629641
50007002304 Census Tract 23.04; Chittenden County; Vermont 1.0429448 1.2021800 1.838650 0.3042668
50007002400 Census Tract 24; Chittenden County; Vermont 11.8274789 5.6813816 1.642508 -0.8057718
50007002501 Census Tract 25.01; Chittenden County; Vermont 0.0000000 0.6807352 2.204221 0.7618858
50007002502 Census Tract 25.02; Chittenden County; Vermont 4.2485732 4.3639585 1.507292 -0.3016304
50007002601 Census Tract 26.01; Chittenden County; Vermont 5.4635280 4.2401886 1.794536 -0.1140693
50007002602 Census Tract 26.02; Chittenden County; Vermont 2.4599329 3.7466572 1.958256 0.3053332
50007002701 Census Tract 27.01; Chittenden County; Vermont 7.7255871 7.7556096 1.979604 -0.1182972
50007002702 Census Tract 27.02; Chittenden County; Vermont 0.0000000 0.5054152 2.146570 0.7036255
50007002800 Census Tract 28; Chittenden County; Vermont 0.0000000 0.4953999 2.281316 0.8397975
50007002900 Census Tract 29; Chittenden County; Vermont 2.9041096 2.9651621 2.122740 0.4340034
50007003000 Census Tract 30; Chittenden County; Vermont 0.0000000 0.4073320 2.148676 0.7057536
50007003101 Census Tract 31.01; Chittenden County; Vermont 0.2404372 0.4136147 2.121749 0.6582124
50007003102 Census Tract 31.02; Chittenden County; Vermont 0.0000000 0.8960573 2.362903 0.9222479
50007003301 Census Tract 33.01; Chittenden County; Vermont 0.8562416 1.3900933 2.133393 0.6179148
50007003304 Census Tract 33.04; Chittenden County; Vermont 0.6900077 1.1421456 1.942244 0.4387971
50007003401 Census Tract 34.01; Chittenden County; Vermont 1.0108052 1.2043281 2.148135 0.6197445
50007003402 Census Tract 34.02; Chittenden County; Vermont 2.5588536 3.7043290 2.056295 0.3960463
50007003501 Census Tract 35.01; Chittenden County; Vermont 1.0618652 1.6958812 2.201754 0.6696140
50007003502 Census Tract 35.02; Chittenden County; Vermont 1.8835019 2.6331180 2.093129 0.4903703
50007003503 Census Tract 35.03; Chittenden County; Vermont 0.0000000 0.9718173 2.219631 0.7774590
50007003600 Census Tract 36; Chittenden County; Vermont 0.0000000 0.3206156 1.912793 0.4673735
50007003900 Census Tract 39; Chittenden County; Vermont 6.8181818 6.4205944 1.679924 -0.3444289
50007004002 Census Tract 40.02; Chittenden County; Vermont 1.4000000 2.2242130 2.022000 0.4593683
50007004100 Census Tract 41; Chittenden County; Vermont 8.7110482 6.2899116 1.825071 -0.3577859
50007004200 Census Tract 42; Chittenden County; Vermont 8.7915937 4.0817367 1.741156 -0.4493990
#graph pct_0 vs wAvg (w/ Vehicle_Access)
var1_var2 <- ggplot(var1_var2_join, aes(x = PCT_0, y = wAvg, color = Vehicle_Access, label = NAME)) + geom_point(size = 2) + labs(x = "% No Vehicles", y = "Average # of Vehicles", color = "Vehicle Access Score", title = "Measures of Vehicle Access by Census Tract") + theme(plot.margin = margin(l = 5))

ggplotly(var1_var2) %>%
  layout(margin = list(r = 15), title = list(text = paste0('Measures of Vehicle Access by Census Tract in Chittenden County, VT',
                                    '<br>',
                                    '<sup>',
                                    'Data source: 2018-2022 ACS. Data acquired with the R tidycensus package.','</sup>')))

Selecting Tracts

With the vehicle access of Chittenden County, VT Census Tracts graphically represented, outliers on either extreme may be qualitatively selected. To ensure the accurate selection of high and low vehicle access groups, however, some further steps are necessary.

First, the base R function quantile() was used to identify data points in the upper and lower 10% of the distribution of Vehicle_Access values.
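To illustrate how this kind of cutoff behaves with 40 observations, the sketch below uses stand-in scores (the integers 1 to 40, not the real Vehicle_Access values) and the default quantile() method.

```r
# Stand-in scores: 40 distinct values, one per tract (not the real scores)
example_scores <- seq_len(40)

upper_cutoff <- quantile(example_scores, 0.9)  # 36.1 with the default method
lower_cutoff <- quantile(example_scores, 0.1)  # 4.9 with the default method

# Strict inequalities, as in the selection step
high_group <- which(example_scores > upper_cutoff)
low_group  <- which(example_scores < lower_cutoff)

length(high_group)  # 4
length(low_group)   # 4
```

With 40 distinct values, the 90th and 10th percentile cutoffs fall between observations, so each tail contains four values.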

#use quantile() function to select tracts (high Vehicle_Access & low Vehicle_Access)

#upper 10% of distribution (High)
var1_var2_join[which(var1_var2_join$Vehicle_Access > quantile(var1_var2_join$Vehicle_Access,.9)),]
##          GEOID                                           NAME PCT_0 MOE_0_PCT
## 9  50007002101 Census Tract 21.01; Chittenden County; Vermont     0 0.5675369
## 14 50007002301 Census Tract 23.01; Chittenden County; Vermont     0 1.4084507
## 24 50007002800    Census Tract 28; Chittenden County; Vermont     0 0.4953999
## 28 50007003102 Census Tract 31.02; Chittenden County; Vermont     0 0.8960573
##        wAvg Vehicle_Access
## 9  2.439841      1.0000000
## 14 2.415493      0.9753942
## 24 2.281316      0.8397975
## 28 2.362903      0.9222479
#lower 10% of distribution (Low)
var1_var2_join[which(var1_var2_join$Vehicle_Access < quantile(var1_var2_join$Vehicle_Access,.1)),]
##          GEOID                                        NAME     PCT_0 MOE_0_PCT
## 3  50007000300  Census Tract 3; Chittenden County; Vermont  8.253549  5.730775
## 7  50007001000 Census Tract 10; Chittenden County; Vermont  5.588044  3.818698
## 17 50007002400 Census Tract 24; Chittenden County; Vermont 11.827479  5.681382
## 40 50007004200 Census Tract 42; Chittenden County; Vermont  8.791594  4.081737
##        wAvg Vehicle_Access
## 3  1.450314     -0.6978283
## 7  1.463288     -0.4593513
## 17 1.642508     -0.8057718
## 40 1.741156     -0.4493990

This process yielded two groups (high and low vehicle access) of four Census Tracts each, based on their Vehicle Access Score (Vehicle_Access). However, these groups must ultimately contain only three tracts each.

To further refine the selection process, a condition was set that the estimate counts for all categories in a given tract must exceed those categories’ corresponding MOEs. This condition was formulated based on the Chittenden County Regional Planning Commission’s (CCRPC) 2018 guide, Best Practices for Reporting American Community Survey in Municipal Planning, which recommends avoiding the use of data where “the MOE for [a] population is higher than the estimate itself” (CCRPC, 2018).

Using the chittenden_county dataframe, the condition was checked for the 8 tracts selected in the previous step, with “YES” or “NO” printed for each category in each tract.

Note: As all tracts in the high vehicle access group had an estimate of 0 for the “No vehicle available” category (resulting in an MOE of 10), the above condition was assessed only on the other three categories for these tracts.
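The condition itself is a simple comparison, sketched below in vectorized form. The estimates are those of Census Tract 10 from the first table, but the MOE values are made up for illustration and are not the published ACS margins of error.

```r
# Census Tract 10 estimates paired with MADE-UP margins of error
tract_check <- data.frame(
  variable = c("no_vehicles", "one_vehicle", "two_vehicles", "three_plus_vehicles"),
  estimate = c(86, 687, 733, 33),
  moe      = c(60, 210, 240, 45)   # hypothetical values
)

# "YES" where the estimate exceeds its MOE, "NO" otherwise
tract_check$reliable <- ifelse(tract_check$estimate > tract_check$moe, "YES", "NO")

tract_check$reliable  # "YES" "YES" "YES" "NO"
```

A small estimate such as 33 is the typical case where the MOE can exceed the estimate.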

#check that estimate > MOE for selected tracts (using chittenden_county)

#Low group
#Census Tract 3
for(i in 9:12) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 10
for(i in 25:28) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "NO"
#Census Tract 24
for(i in 65:68) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 42
for(i in 157:160) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "YES"
#High group (ignore no_vehicles variable)
#Census Tract 21.01
for(i in 34:36) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 23.01
for(i in 54:56) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 28
for(i in 94:96) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 31.02
for(i in 110:112) {
  if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
    print("YES")
  }
  else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
    print("NO")
  }
}
## [1] "NO"
## [1] "YES"
## [1] "YES"

As Census Tract 10 and Census Tract 31.02 each had an MOE in excess of the corresponding estimate count, both tracts were eliminated from their respective groups, leaving 3 tracts in each group:

Low Vehicle Access Tracts: Census Tract 3, Census Tract 24, Census Tract 42

High Vehicle Access Tracts: Census Tract 21.01, Census Tract 23.01, Census Tract 28

While the tracts that remain have each met the Estimate > MOE condition, the statistical reliability of their data must be further assessed prior to concluding the selection process.

To do so, the mean coefficients of variation (CV) for selected tracts were calculated according to Census Bureau (2018) guidelines, and results were organized in a new dataframe, cv.

Note: As all tracts in the high vehicle access group had an estimate of 0 for the “No vehicle available” category (resulting in an MOE of 10), mean CV was calculated from the other three variables’ estimates and MOEs for these tracts.
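The CV calculation used in the script below can be summarized as: divide the 90% MOE by 1.645 to obtain a standard error, divide that by the estimate, and multiply by 100. A minimal sketch follows, with a hypothetical helper, cv_from_moe(), and made-up inputs.

```r
# Hypothetical helper: coefficient of variation (%) from a 90% MOE
cv_from_moe <- function(estimate, moe) {
  se <- moe / 1.645      # standard error implied by a 90% MOE
  100 * se / estimate    # CV expressed as a percentage
}

# Made-up estimates and MOEs for three categories in one tract
example_cv <- cv_from_moe(estimate = c(1000, 500, 250),
                          moe      = c(329, 164.5, 164.5))

round(example_cv, 2)        # 20 20 40
round(mean(example_cv), 2)  # 26.67
```

The mean CV averages the category-level CVs, so one small, noisy category can pull a tract's mean upward.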

#find CV for selected tracts' variables

#Low
cv24 <- c((((chittenden_county$moe[65] / 1.645) / chittenden_county$estimate[65]) * 100), 
          (((chittenden_county$moe[66] / 1.645) / chittenden_county$estimate[66]) * 100),
          (((chittenden_county$moe[67] / 1.645) / chittenden_county$estimate[67]) * 100),
          (((chittenden_county$moe[68] / 1.645) / chittenden_county$estimate[68]) * 100)
          )

cv3 <- c((((chittenden_county$moe[9] / 1.645) / chittenden_county$estimate[9]) * 100), 
          (((chittenden_county$moe[10] / 1.645) / chittenden_county$estimate[10]) * 100),
          (((chittenden_county$moe[11] / 1.645) / chittenden_county$estimate[11]) * 100),
          (((chittenden_county$moe[12] / 1.645) / chittenden_county$estimate[12]) * 100)
          )

cv42 <- c((((chittenden_county$moe[157] / 1.645) / chittenden_county$estimate[157]) * 100), 
         (((chittenden_county$moe[158] / 1.645) / chittenden_county$estimate[158]) * 100),
         (((chittenden_county$moe[159] / 1.645) / chittenden_county$estimate[159]) * 100),
         (((chittenden_county$moe[160] / 1.645) / chittenden_county$estimate[160]) * 100)
         )

#High (w/o no_vehicles (where estimate = 0))
cv21.01 <- c((((chittenden_county$moe[34] / 1.645) / chittenden_county$estimate[34]) * 100),
          (((chittenden_county$moe[35] / 1.645) / chittenden_county$estimate[35]) * 100),
          (((chittenden_county$moe[36] / 1.645) / chittenden_county$estimate[36]) * 100)
)

cv23.01 <- c((((chittenden_county$moe[54] / 1.645) / chittenden_county$estimate[54]) * 100),
         (((chittenden_county$moe[55] / 1.645) / chittenden_county$estimate[55]) * 100),
         (((chittenden_county$moe[56] / 1.645) / chittenden_county$estimate[56]) * 100)
)

cv28 <- c((((chittenden_county$moe[94] / 1.645) / chittenden_county$estimate[94]) * 100),
          (((chittenden_county$moe[95] / 1.645) / chittenden_county$estimate[95]) * 100),
          (((chittenden_county$moe[96] / 1.645) / chittenden_county$estimate[96]) * 100)
)

#group CV means for selected tracts
cv_low <- c(mean(cv24), mean(cv3), mean(cv42))
cv_high <- c(mean(cv21.01), mean(cv23.01), mean(cv28))


#create dataframe for selected tracts' CVs (low group first, then high group)
cv <- data.frame(V1 = c(cv_low, cv_high))

#add tract names, name columns
cv <- cv %>% 
  mutate(NAME = c("Census Tract 24; Chittenden County; Vermont", "Census Tract 3; Chittenden County; Vermont", "Census Tract 42; Chittenden County; Vermont", "Census Tract 21.01; Chittenden County; Vermont", "Census Tract 23.01; Chittenden County; Vermont", "Census Tract 28; Chittenden County; Vermont")) %>%
  rename("CV" = V1)

#reorder columns
cv <- cv[,c(2,1)]
Mean CV for Selected Census Tracts
NAME CV
Census Tract 24; Chittenden County; Vermont 27.15444
Census Tract 3; Chittenden County; Vermont 25.76912
Census Tract 42; Chittenden County; Vermont 23.94867
Census Tract 21.01; Chittenden County; Vermont 28.52932
Census Tract 23.01; Chittenden County; Vermont 30.80666
Census Tract 28; Chittenden County; Vermont 19.89782

Mean CVs for selected tracts can now be compared with CCRPC (2018) guidelines for assessing the statistical reliability of ACS data, which are as follows:

Although the mean CV for Census Tract 23.01 slightly exceeds 30%, thus falling into the “Low Reliability” category, mean CVs for selected tracts are generally between 15% and 30%. This indicates a Medium to Medium-Low level of statistical reliability, which, given Vermont’s low population density (and therefore relatively high MOE for ACS data), is usable in the context of this project’s Census Tract Selection phase.
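The comparison can be sketched by sorting the mean CVs from the table above into bands. The bands here are built only from the 15% and 30% thresholds mentioned in the text, and the labels are placeholders rather than CCRPC's category names.

```r
# Mean CVs for the six selected tracts, from the table above
mean_cv <- c("24" = 27.15444, "3" = 25.76912, "42" = 23.94867,
             "21.01" = 28.52932, "23.01" = 30.80666, "28" = 19.89782)

# Placeholder bands using only the 15% and 30% thresholds (an assumption)
cv_band <- cut(mean_cv, breaks = c(0, 15, 30, Inf),
               labels = c("below 15%", "15% to 30%", "above 30%"))

table(cv_band)  # 0 below 15%, 5 in 15% to 30%, 1 above 30%
```

Only Census Tract 23.01 lands above 30%, and by less than one percentage point.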

Note: Issues with MOE

As noted by the CCRPC (2018), the American Community Survey provides estimates that reflect a community’s social and economic conditions. An estimate is “NOT an official count of the population nor is it a point in time count” (CCRPC, 2018). Therefore, it is extremely important to consider MOE when using ACS data to inform decision making.

To reflect this inherent uncertainty in the data, I initially sought to calculate wAvg in a manner that accounts for MOE. To do so, I found each tract’s wAvg using estimates alone, working as if the data were exact. Then, I tried to find each tract’s wAvg with MOE (wAvg_MOE) added to estimates prior to weighting and summing. By finding the differences between wAvg_MOE and wAvg, I was attempting to determine the MOE—correctly propagated—for wAvg, with which a vertical error bar could be displayed on a graph plotting PCT_0 and wAvg.

Example:

#weighted average (+ moe)
#calculate sum of count and moe for each variable
  #for no_vehicles
for(i in 1:40) {
  moe[i,7] = chittenden_county_long_moe[i,3] + moe[i,3]
}

colnames(moe)[7] <- "CountMOE_Sum_0"

  #for one_vehicle
for(i in 1:40) {
  moe[i,8] = chittenden_county_long_moe[i,4] + moe[i,4]
}

colnames(moe)[8] <- "CountMOE_Sum_1"

  #for two_vehicles
for(i in 1:40) {
  moe[i,9] = chittenden_county_long_moe[i,5] + moe[i,5]
}

colnames(moe)[9] <- "CountMOE_Sum_2"

  #for three_plus_vehicles
for(i in 1:40) {
  moe[i,10] = chittenden_county_long_moe[i,6] + moe[i,6]
}

colnames(moe)[10] <- "CountMOE_Sum_3+"


#calculate sum of countsum and moesum (count + moe rowsums)
countMOE_rowsum <- rowSums(moe[,7:10])


#find weighted average (+ moe)
for (i in 1:40) {
  weighted_avg[i,2] = ((moe[i,7] / countMOE_rowsum[i])*0) + ((moe[i,8] / countMOE_rowsum[i])*1) + ((moe[i,9] / countMOE_rowsum[i])*2) + ((moe[i,10] / countMOE_rowsum[i])*3)
}

#find difference between weighted average and weighted average (+ moe)
for (i in 1:40) {
  weighted_avg[i,3] = (weighted_avg[i,2] - weighted_avg[i,1])
}

However, a significant proportion of these differences were negative:

As MOEs had been added to estimates, not subtracted, this indicated an issue with either my approach or my data.

Upon examining the original data for the tracts where wAvg_MOE - wAvg was most negative, I found that the majority of tracts with negative differences contained at least one variable whose MOE exceeded its estimate. Vermont’s low population density, even within its most densely populated county, could be a significant factor in explaining why these MOEs were so large.
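One possible reading of why adding MOEs can lower the weighted average: once MOEs are added to every count and the shares are recomputed, the result is a blend of the original wAvg and a weighted average computed from the MOEs alone. It therefore falls whenever the MOEs are concentrated in lower-vehicle categories than the estimates are. The sketch below illustrates this with made-up numbers (not project data); `weighted_vehicles` is a hypothetical helper, and the weights 0 to 3 mirror those used in the loop above.

```r
# Hypothetical helper: weighted average number of vehicles (0, 1, 2, 3+)
weighted_vehicles <- function(counts, weights = c(0, 1, 2, 3)) {
  sum(counts * weights) / sum(counts)
}

# Made-up tract: the no-vehicle MOE (25) exceeds its estimate (10)
estimate <- c(no_vehicles = 10, one_vehicle = 200,
              two_vehicles = 400, three_plus_vehicles = 150)
moe <- c(no_vehicles = 25, one_vehicle = 60,
         two_vehicles = 70, three_plus_vehicles = 50)

wAvg <- weighted_vehicles(estimate)            # estimates alone
wAvg_MOE <- weighted_vehicles(estimate + moe)  # MOE added before weighting
wAvg_of_moe <- weighted_vehicles(moe)          # weighted average of the MOEs alone

# wAvg_MOE is a blend of wAvg and wAvg_of_moe, weighted by their totals
blend <- (sum(estimate) * wAvg + sum(moe) * wAvg_of_moe) / sum(estimate + moe)

# Negative here because wAvg_of_moe is smaller than wAvg
difference <- wAvg_MOE - wAvg
```

Under this reading, a negative difference does not by itself signal faulty data; it reflects that the sum-and-reweight approach shifts category shares rather than propagating uncertainty.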

The relationship between estimate size and its corresponding MOE in the data can be observed in the graph below:

Consistent with the distribution of differences shown earlier, I determined that 21.875% of estimates had MOEs greater than their value. Further, these instances were spread widely among tracts rather than isolated in a problematic few.
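A check of this kind can be sketched as follows. The data frame `acs_long` and its values are hypothetical stand-ins for the long-format tidycensus output; only the column names estimate and moe are taken from the code above.

```r
# Illustrative long-format data; values are made up, not project data
acs_long <- data.frame(
  NAME = rep(c("Tract A", "Tract B"), each = 4),
  variable = rep(c("no_vehicles", "one_vehicle",
                   "two_vehicles", "three_plus_vehicles"), times = 2),
  estimate = c(10, 200, 400, 150, 40, 310, 520, 12),
  moe = c(25, 60, 70, 50, 30, 80, 95, 18)
)

# Flag estimates whose MOE is larger than the estimate itself
acs_long$moe_exceeds <- acs_long$moe > acs_long$estimate

# Percentage of all estimates that are flagged
pct_exceeding <- mean(acs_long$moe_exceeds) * 100

# Number of flagged estimates per tract, to see whether they cluster
flagged_by_tract <- tapply(acs_long$moe_exceeds, acs_long$NAME, sum)
```

If the denominator is the 160 estimates implied by 40 tracts and 4 variables, the reported 21.875% corresponds to 35 estimates.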

Because I discovered this issue toward the end of the time I had allotted for the Census Tract Selection phase, and because no suitable alternative ACS data were available, I decided to proceed with the data I had already collected while making a few changes to my approach.

Rather than incorporating MOEs into the wAvg variable, I chose to ignore MOE until Vehicle_Access was calculated and tracts in the upper and lower 10% of the distribution were identified. Then, I evaluated the MOE for both groups’ tracts by comparing their mean CVs with CCRPC guidelines, as detailed in the “Selecting Tracts” section above.

In this way, I was able to make informed (accounting for MOE) selections of Census Tracts while maintaining my original project timeline.

References:

Chittenden County Regional Planning Commission (CCRPC). (2018). Best Practices for Reporting American Community Survey in Municipal Planning. https://www.ccrpcvt.org/wp-content/uploads/2018/10/ACS_Guide_Final_20181003.pdf

Fuller, S. & U.S. Census Bureau. (2016). Using ACS Estimates and Margins of Error. https://www.census.gov/content/dam/Census/programs-surveys/acs/guidance/training-presentations/2016_MOE_Slides_01.pdf

U.S. Census Bureau. (2018). 8. Calculating Measures of Error for Derived Estimates. https://www.census.gov/content/dam/Census/library/publications/2018/acs/acs_general_handbook_2018_ch08.pdf

U.S. Census Bureau. (2022). Means of Transportation to Work by Vehicles Available. American Community Survey, ACS 5-Year Estimates Detailed Tables, Table B08141. Retrieved June 17, 2024, from https://data.census.gov/table/ACSDT5Y2022.B08141?q=B08141: MEANS OF TRANSPORTATION TO WORK BY VEHICLES AVAILABLE&g=050XX00US50007$1400000.